Prosper.com is a San Francisco, California-based company that facilitates peer-to-peer lending to borrowers who meet certain conditions. Borrowers make loan requests, and investors contribute as little as $25 towards the loans of their choice, based on the credit risk they are willing to take and the returns they expect from the loans.
## 'data.frame': 113937 obs. of 81 variables:
## $ ListingKey : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
## $ ListingNumber : int 193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
## $ ListingCreationDate : Factor w/ 113064 levels "2005-11-09 20:44:28.847000000",..: 14184 111894 6429 64760 85967 100310 72556 74019 97834 97834 ...
## $ CreditGrade : Factor w/ 9 levels "","A","AA","B",..: 5 1 8 1 1 1 1 1 1 1 ...
## $ Term : int 36 36 36 36 36 60 36 36 36 36 ...
## $ LoanStatus : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
## $ ClosedDate : Factor w/ 2803 levels "","2005-11-25 00:00:00",..: 1138 1 1263 1 1 1 1 1 1 1 ...
## $ BorrowerAPR : num 0.165 0.12 0.283 0.125 0.246 ...
## $ BorrowerRate : num 0.158 0.092 0.275 0.0974 0.2085 ...
## $ LenderYield : num 0.138 0.082 0.24 0.0874 0.1985 ...
## $ EstimatedEffectiveYield : num NA 0.0796 NA 0.0849 0.1832 ...
## $ EstimatedLoss : num NA 0.0249 NA 0.0249 0.0925 ...
## $ EstimatedReturn : num NA 0.0547 NA 0.06 0.0907 ...
## $ ProsperRating..numeric. : int NA 6 NA 6 3 5 2 4 7 7 ...
## $ ProsperRating..Alpha. : Factor w/ 8 levels "","A","AA","B",..: 1 2 1 2 6 4 7 5 3 3 ...
## $ ProsperScore : num NA 7 NA 9 4 10 2 4 9 11 ...
## $ ListingCategory..numeric. : int 0 2 0 16 2 1 1 2 7 7 ...
## $ BorrowerState : Factor w/ 52 levels "","AK","AL","AR",..: 7 7 12 12 25 34 18 6 16 16 ...
## $ Occupation : Factor w/ 68 levels "","Accountant/CPA",..: 37 43 37 52 21 43 50 29 24 24 ...
## $ EmploymentStatus : Factor w/ 9 levels "","Employed",..: 9 2 4 2 2 2 2 2 2 2 ...
## $ EmploymentStatusDuration : int 2 44 NA 113 44 82 172 103 269 269 ...
## $ IsBorrowerHomeowner : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
## $ CurrentlyInGroup : Factor w/ 2 levels "False","True": 2 1 2 1 1 1 1 1 1 1 ...
## $ GroupKey : Factor w/ 707 levels "","00343376901312423168731",..: 1 1 335 1 1 1 1 1 1 1 ...
## $ DateCreditPulled : Factor w/ 112992 levels "2005-11-09 00:30:04.487000000",..: 14347 111883 6446 64724 85857 100382 72500 73937 97888 97888 ...
## $ CreditScoreRangeLower : int 640 680 480 800 680 740 680 700 820 820 ...
## $ CreditScoreRangeUpper : int 659 699 499 819 699 759 699 719 839 839 ...
## $ FirstRecordedCreditLine : Factor w/ 11586 levels "","1947-08-24 00:00:00",..: 8639 6617 8927 2247 9498 497 8265 7685 5543 5543 ...
## $ CurrentCreditLines : int 5 14 NA 5 19 21 10 6 17 17 ...
## $ OpenCreditLines : int 4 14 NA 5 19 17 7 6 16 16 ...
## $ TotalCreditLinespast7years : int 12 29 3 29 49 49 20 10 32 32 ...
## $ OpenRevolvingAccounts : int 1 13 0 7 6 13 6 5 12 12 ...
## $ OpenRevolvingMonthlyPayment : num 24 389 0 115 220 1410 214 101 219 219 ...
## $ InquiriesLast6Months : int 3 3 0 0 1 0 0 3 1 1 ...
## $ TotalInquiries : num 3 5 1 1 9 2 0 16 6 6 ...
## $ CurrentDelinquencies : int 2 0 1 4 0 0 0 0 0 0 ...
## $ AmountDelinquent : num 472 0 NA 10056 0 ...
## $ DelinquenciesLast7Years : int 4 0 0 14 0 0 0 0 0 0 ...
## $ PublicRecordsLast10Years : int 0 1 0 0 0 0 0 1 0 0 ...
## $ PublicRecordsLast12Months : int 0 0 NA 0 0 0 0 0 0 0 ...
## $ RevolvingCreditBalance : num 0 3989 NA 1444 6193 ...
## $ BankcardUtilization : num 0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
## $ AvailableBankcardCredit : num 1500 10266 NA 30754 695 ...
## $ TotalTrades : num 11 29 NA 26 39 47 16 10 29 29 ...
## $ TradesNeverDelinquent..percentage. : num 0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
## $ TradesOpenedLast6Months : num 0 2 NA 0 2 0 0 0 1 1 ...
## $ DebtToIncomeRatio : num 0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
## $ IncomeRange : Factor w/ 8 levels "$0","$1-24,999",..: 4 5 7 4 3 3 4 4 4 4 ...
## $ IncomeVerifiable : Factor w/ 2 levels "False","True": 2 2 2 2 2 2 2 2 2 2 ...
## $ StatedMonthlyIncome : num 3083 6125 2083 2875 9583 ...
## $ LoanKey : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
## $ TotalProsperLoans : int NA NA NA NA 1 NA NA NA NA NA ...
## $ TotalProsperPaymentsBilled : int NA NA NA NA 11 NA NA NA NA NA ...
## $ OnTimeProsperPayments : int NA NA NA NA 11 NA NA NA NA NA ...
## $ ProsperPaymentsLessThanOneMonthLate: int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPaymentsOneMonthPlusLate : int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPrincipalBorrowed : num NA NA NA NA 11000 NA NA NA NA NA ...
## $ ProsperPrincipalOutstanding : num NA NA NA NA 9948 ...
## $ ScorexChangeAtTimeOfListing : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanCurrentDaysDelinquent : int 0 0 0 0 0 0 0 0 0 0 ...
## $ LoanFirstDefaultedCycleNumber : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanMonthsSinceOrigination : int 78 0 86 16 6 3 11 10 3 3 ...
## $ LoanNumber : int 19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
## $ LoanOriginalAmount : int 9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
## $ LoanOriginationDate : Factor w/ 1873 levels "2005-11-15 00:00:00",..: 426 1866 260 1535 1757 1821 1649 1666 1813 1813 ...
## $ LoanOriginationQuarter : Factor w/ 33 levels "Q1 2006","Q1 2007",..: 18 8 2 32 24 33 16 16 33 33 ...
## $ MemberKey : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
## $ MonthlyLoanPayment : num 330 319 123 321 564 ...
## $ LP_CustomerPayments : num 11396 0 4187 5143 2820 ...
## $ LP_CustomerPrincipalPayments : num 9425 0 3001 4091 1563 ...
## $ LP_InterestandFees : num 1971 0 1186 1052 1257 ...
## $ LP_ServiceFees : num -133.2 0 -24.2 -108 -60.3 ...
## $ LP_CollectionFees : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_GrossPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NetPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NonPrincipalRecoverypayments : num 0 0 0 0 0 0 0 0 0 0 ...
## $ PercentFunded : num 1 1 1 1 1 1 1 1 1 1 ...
## $ Recommendations : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsCount : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsAmount : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Investors : int 258 1 41 158 20 1 1 1 1 1 ...
I will examine this dataset by considering the two groups that make the platform successful: the borrowers and the lenders. I will explore the factors that make a borrower a good candidate for a loan and what the lender expects as a return on the money invested in the platform.
## AK AL AR AZ CA CO CT DC DE FL GA
## 5515 200 1679 855 1901 14717 2210 1627 382 300 6720 5008
## HI IA ID IL IN KS KY LA MA MD ME MI
## 409 186 599 5921 2078 1062 983 954 2242 2821 101 3593
## MN MO MS MT NC ND NE NH NJ NM NV NY
## 2318 2615 787 330 3084 52 674 551 3097 472 1090 6729
## OH OK OR PA RI SC SD TN TX UT VA VT
## 4197 971 1817 2972 435 1122 189 1737 6842 877 3278 207
## WA WI WV WY
## 3048 1842 391 150
The state with the highest number of borrowers is California, with close to 15,000 (14,717). It is followed by Texas (6,842), New York (6,729) and Florida (6,720), each with a little under half of California's count. These four states also have the four highest populations in the country, so it is not surprising that they account for the most borrowers. There are also 5,515 borrowers whose state is not listed.
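As a quick check on how the next three states compare with California, their counts can be expressed as a share of California's total. This is only a sketch: the vector name `top_states` is made up for illustration, and the counts are copied by hand from the printed state table rather than recomputed from the raw data.

```r
# Counts copied by hand from the state table above (not recomputed from the data).
top_states <- c(CA = 14717, TX = 6842, NY = 6729, FL = 6720)

# Each state's count as a fraction of California's count.
share_of_ca <- round(top_states / top_states[["CA"]], 3)
print(share_of_ca)

# Combined share of all 113,937 listings held by these four states.
combined_share <- sum(top_states) / 113937
print(round(combined_share, 3))
```

Texas, New York and Florida each come out a little under half of California's count, and together the four states hold roughly three in ten listings.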
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 4000 6500 8337 12000 35000
The minimum loan amount is 1,000 dollars, the maximum is 35,000 dollars, the mean is 8,337 dollars and the median is 6,500 dollars. There is a huge spike at the 4,000 dollar mark, which is the most common loan amount, and there are other spikes at the 10,000 and 15,000 dollar levels.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 12.00 36.00 36.00 40.83 36.00 60.00
It looks like Prosper mostly lends on a 3 year term: more than 3/4 of the loans are 36 month loans, compared to a little over 1/5 that are 5 year loans. Longer-term loans may provide more return on investment, but they tend to be more prone to delinquency, so a 3 year loan is a safer bet.
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.653 15.629 20.976 21.883 28.381 51.229 25
The borrower APR is usually based on the borrower's creditworthiness and the risk the borrower poses to the lender, and is expressed here as a percentage. I immediately noticed the spike at about 36% APR: there are over 10,000 loans where the APR is above 35%. I wonder how borrowers manage to pay back with this much interest.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 131.6 217.7 272.5 371.6 2251.5
A look at the borrowers' monthly payments shows a mean of $272.50. There is a sharp spike at around $175, which corresponds to a $4,000 loan on a 3 year term. I also noted 8 outliers for people paying over $2,000 a month.
There are low figures in the years around Prosper's launch. There is also a slump in lending activity in 2008 and 2009, which corresponds to the 2008 financial crisis that hit the country and led to a recession. The hit to the financial sector led to almost no loans being issued. As the economy recovered from the recession in 2012, lending numbers started climbing and ultimately spiked in 2013.
The majority of the loans are in good standing: about 83% are either current or completed. Close to 12% of the loans are charged off. I would think that is not that bad, since the data is spread over an 8 year period.
The majority of borrowers have no delinquencies within the last 7 years. I notice, however, that a quarter of the loans are held by borrowers with more than one delinquency in those 7 years. It looks like Prosper does a good amount of second-chance lending to people who have been delinquent and may not be extended loans elsewhere.
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.000 0.310 0.600 0.561 0.840 5.950 7604
Bankcard utilization is considered when determining risk: a higher utilization indicates that the borrower is using too much of the credit available on their credit lines. Notably, about 1.5% of borrowers are using more than their available credit limit.
## Employed Full-time Not available Not employed
## 2255 67322 26355 5347 835
## Other Part-time Retired Self-employed
## 3806 1088 795 6134
Having a job so that the borrower can pay back the loan is one of the most important considerations in lending decisions, so it is no surprise that less than 1% of the borrowers are listed as not employed.
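The percentage quoted for borrowers who are not employed can be reproduced from the printed counts. The sketch below copies the numbers by hand from the employment status table; the vector name `employment` and the "(blank)" label for the unlabelled first column are assumptions made for illustration.

```r
# Counts copied by hand from the employment status table above.
# "(blank)" stands in for the unlabelled first level in the original output.
employment <- c(
  "(blank)" = 2255, "Employed" = 67322, "Full-time" = 26355,
  "Not available" = 5347, "Not employed" = 835, "Other" = 3806,
  "Part-time" = 1088, "Retired" = 795, "Self-employed" = 6134
)

# Convert counts to percentages of all listings.
employment_pct <- round(100 * employment / sum(employment), 2)
print(sort(employment_pct, decreasing = TRUE))
```

The counts add up to the full 113,937 listings, and the "Not employed" group works out to about 0.73% of them.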
Borrower income is one of the most important factors in lending decisions. There needs to be proof that the borrower has the ability to pay back the loan. It looks like Prosper mainly lends to borrowers within the $25,000 to $75,000 income range: more than half of the borrowers fall in this range.
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.00 14.00 22.00 27.59 32.00 1001.00 8554
The debt to income ratio is one way lenders measure a borrower's ability to manage payments on the money they have borrowed. A low ratio shows that a borrower is well equipped to pay back what has been borrowed. I noticed that there are a few borrowers with a debt to income ratio over 75%, which would make it almost impossible to repay the loan.
##
## 1 2 3 4 5 6 7 8 9 10 11
## 992 5766 7642 12595 9813 12278 10597 12053 6911 4750 1456
The custom Prosper score is a risk score built using historical Prosper data to assess the risk of Prosper borrower listings. The output to Prosper users is a score that ranges from 1 to 11, with 11 being the best (lowest risk) and 1 the worst (highest risk). The most common score is 4, with 12,595 borrowers, and the least common is 1, with 992.
## A AA B C D E HR
## 29084 14551 5372 15581 18345 14274 9795 6935
Prosper also uses a custom alphabetical rating to fund the loans. The best rating is AA, followed alphabetically by A to E, and the lowest rating is HR. Blank values are the largest group (29,084), since those loans have not been assigned an alphabetical rating. Among the rated loans, C has the most borrowers (18,345) and AA the fewest (5,372). It seems that Prosper's borrowers generally fall around rating C, which is the middle of the scale.
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 9.5 669.5 689.5 695.1 729.5 889.5 591
Credit scores are used by lenders to assess how risky a borrower is. Here, any score below 550 is considered BAD, under 650 POOR, under 700 FAIR, under 750 GOOD, and any score of 750 or above EXCELLENT. The data provides a lower and an upper limit for each score, which I combine to get the average score. The distribution is skewed to the right.
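The averaging and banding described here can be sketched as follows. The ten rows are the first values of `CreditScoreRangeLower` and `CreditScoreRangeUpper` shown in the `str()` output; the `CreditBand` column name, and the choice to put a score of exactly 750 in EXCELLENT, are assumptions made for illustration and may differ from the original analysis.

```r
# First ten rows of the two credit score columns, copied from the str() output above.
scores <- data.frame(
  CreditScoreRangeLower = c(640, 680, 480, 800, 680, 740, 680, 700, 820, 820),
  CreditScoreRangeUpper = c(659, 699, 499, 819, 699, 759, 699, 719, 839, 839)
)

# Midpoint of the reported score range.
scores$Average_credit <- (scores$CreditScoreRangeLower + scores$CreditScoreRangeUpper) / 2

# Bin the midpoint into the bands described in the text.
# right = FALSE makes each interval closed on the left, e.g. [650, 700).
scores$CreditBand <- cut(
  scores$Average_credit,
  breaks = c(-Inf, 550, 650, 700, 750, Inf),
  labels = c("BAD", "POOR", "FAIR", "GOOD", "EXCELLENT"),
  right = FALSE
)

print(table(scores$CreditBand))
```

Because the lower and upper limits are 19 points apart, every midpoint ends in .5, which matches the values in the summary output.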
Although having several open lines of credit generally improves a borrower's credit score, too many open lines of credit might indicate some financial instability. About 4% of borrowers have over 20 open credit lines; in lending considerations, I would think they belong in the higher risk pool. There are spikes at 7 to 10 credit lines, a slump between 10 and 15 and, interestingly, another pick-up at 15 credit lines. Also notable is that close to 6% have over 20 open lines of credit. I wonder how one is able to manage over 20 lines of credit.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 2.00 44.00 80.48 115.00 1189.00
Prosper lends money from a pool of people who invest their money for a profit. To spread the risk, an individual loan is usually funded by a group of investors. While close to a quarter of the loans are funded by a single person, just over half of the loans have fewer than 50 investors, and around 47% are funded by more than 50 investors.
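Shares like these can be computed by taking the mean of a logical condition. The sketch below uses only the first ten `Investors` values shown in the `str()` output, so its proportions are illustrative and will not match the figures for the full dataset; the variable names are made up for the example.

```r
# First ten values of Investors, copied from the str() output above.
investors <- c(258, 1, 41, 158, 20, 1, 1, 1, 1, 1)

# mean() of a logical vector gives the proportion of TRUE values.
share_single  <- mean(investors == 1)
share_over_50 <- mean(investors > 50)

print(c(single = share_single, over_50 = share_over_50))
```

On the full data frame the same two expressions would be applied to `prosper_data$Investors`.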
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.00 12.42 17.30 18.27 24.00 49.25
Investors love to make more money on their investments, and by taking more risk they get more reward. The riskier loans have an eye-popping yield of over 30%: more than 9% of the loans have this return, and there is a huge spike at the 32% mark, which could be attributed to subprime borrowers. There are also some loans with negative yields, which indicates that some lenders lost money; these are the loans that have defaulted or been charged off.
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.00 15.00 22.00 23.23 30.00 126.00 7544
Total trades counts the trade lines (credit accounts) on a borrower's credit profile rather than trades made on the Prosper platform. The median number of trades is 22, although some borrowers have as many as 126.
Originally, there are 113,937 loan records with 81 features or variables. I picked a combination of features which I thought would give a good representation of the dataset and the relationships that exist between the variables.
The ProsperLoan dataset contains 81 variables about 113,937 loans made through the prosper.com marketplace. The loans cover the period from 2005-11-15 to 2014-03-12.
Prosper marketplace relies on investors who make their money available to prospective borrowers in the marketplace. The main feature of interest in the dataset is that close to a quarter of the loans have been funded by one person. When investors put their money in Prosper they expect a good return, and part of mitigating losses is to spread each loan among several lenders. A single lender bears the whole loss if the borrower defaults on the loan.
Prosper's internal credit ratings, both alphabetical and numerical, together with the borrower's credit score, help in establishing the borrower's risk. These scores are also used in the analysis of the dataset.
Yes. Since the income is listed in ranges, I created a variable to get an ordered list of the income ranges. Also, the credit score Prosper uses comes as two values, the lower and the upper end of the range, so I combined them into a new variable, Average_credit, to use as the average score.
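An ordered income variable of the kind described can be sketched with an ordered factor. By default R sorts factor levels alphabetically, which would place "$100,000+" ahead of "$25,000-49,999", so the levels are listed explicitly. The sample values and the level labels beyond "$0" and "$1-24,999" are assumptions, since the `str()` output truncates the level list, and the variable names are made up for illustration.

```r
# Hypothetical sample of income ranges; labels past the first two are assumed.
income <- c("$50,000-74,999", "$1-24,999", "$100,000+", "$25,000-49,999", "$0")

# Levels listed from lowest to highest income, not alphabetically.
income_levels <- c("$0", "$1-24,999", "$25,000-49,999",
                   "$50,000-74,999", "$75,000-99,999", "$100,000+")

income_ordered <- factor(income, levels = income_levels, ordered = TRUE)

# Sorting and comparisons now follow income, not spelling.
print(sort(income_ordered))
print(income_ordered[1] > income_ordered[2])
```

Plots and tables built on an ordered factor keep the ranges in this order. Categories such as unemployed or undisclosed income would need their own handling, which is left out of this sketch.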
I did not notice any meaningful unusual distributions, but I did notice that close to 20% of borrowers have an interest rate over 30%. The alphabetical Prosper rating also has more than 25% of its entries blank (29,084 of 113,937); other scores that Prosper uses give a more complete picture of the dataset.
Before I analyze the bivariate relationships, I will first calculate and display the correlation between the various variables that I will explore.
##
## Pearson's product-moment correlation
##
## data: prosper_data$LoanOriginalAmount and prosper_data$Term
## t = 121.6, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.3337778 0.3440569
## sample estimates:
## cor
## 0.3389275
Both the amount borrowed and its median go up as the length of the loan increases. There is a positive correlation (0.34) between the amount borrowed and the loan term.
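For reference, a Pearson test like the one printed above comes from base R's `cor.test()`. The sketch below runs it on only the first ten rows of `Term` and `LoanOriginalAmount` shown in the `str()` output, so the estimate will differ from the full-data value of 0.34; the data frame name `sample_loans` is an assumption made for illustration.

```r
# First ten rows of the two columns, copied from the str() output above.
sample_loans <- data.frame(
  Term = c(36, 36, 36, 36, 36, 60, 36, 36, 36, 36),
  LoanOriginalAmount = c(9425, 10000, 3001, 10000, 15000,
                         15000, 3000, 10000, 10000, 10000)
)

# Pearson correlation test, the same kind of test shown in the output above.
result <- cor.test(sample_loans$LoanOriginalAmount, sample_loans$Term)

# Median loan amount per term, which is what a boxplot by term compares.
median_by_term <- tapply(sample_loans$LoanOriginalAmount, sample_loans$Term, median)

print(result$estimate)
print(median_by_term)
```

On the full data the same call would be `cor.test(prosper_data$LoanOriginalAmount, prosper_data$Term)`, as the `data:` line of the printed output indicates.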
Here we see that most loans are issued at a Prosper score of around 4, and that the borrower rate falls as the Prosper score rises.
Lending activity started out low, then picked up steam only to fall in 2008 and 2009. It shot right back up to the highest lending amounts in 2013 and 2014.
Here we see the spikes in 2012 and 2013; the largest amount borrowed is in the first quarter of 2014. There is a slump in lending in the 2008 and 2009 quarters. These trends coincide with the economic recession and the recovery of the US economy.
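One practical detail behind a quarterly trend plot: the `LoanOriginationQuarter` labels (for example "Q1 2006", "Q1 2007") sort alphabetically by quarter number rather than by date, as the level order in the `str()` output shows. A possible way to put them in chronological order is sketched below; the sample labels and variable names are made up for illustration, and this is not necessarily how the original plot was built.

```r
# Hypothetical sample of quarter labels in the "Qn YYYY" format used by the data.
quarters <- c("Q1 2014", "Q4 2008", "Q1 2006", "Q3 2013", "Q1 2007")

# Split each label into its year and quarter number.
yr  <- as.integer(sub("^Q[1-4] ", "", quarters))
qtr <- as.integer(sub("^Q([1-4]) .*$", "\\1", quarters))

# Order by year first, then by quarter within the year.
chronological <- unique(quarters[order(yr, qtr)])

# An ordered factor keeps this order on plot axes and in tables.
quarter_factor <- factor(quarters, levels = chronological, ordered = TRUE)

print(chronological)
```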
There is a sharp drop in the median for the Employed category compared to the Full-time category. This detail might speak to the trend that a lot of people no longer have full-time positions but rather part-time and contract work, especially in the gig economy.
The best rating is AA, followed alphabetically by A to E, and the lowest rating is HR. From the plot we can clearly see that the lower the Prosper rating, the higher the yield from the loan.
Having removed most of the outliers from this boxplot, we see that the median for the full-time employed borrowers closely coincides with that of the not employed. That would indicate that the unemployed are taking out large loans, probably to cover their daily expenses. I wonder how Prosper ensures that they pay back, since they are not employed.
##
## Pearson's product-moment correlation
##
## data: prosper_data$BorrowerRate and prosper_data$DebtToIncomeRatio
## t = 20.465, df = 105380, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.05690080 0.06892819
## sample estimates:
## cor
## 0.06291678
From this plot, we see that borrower rates of 5 to 30% are where debt to income ratios of 0 to 50% are concentrated. The correlation is positive but very weak, at 0.063.
The loan original amount is highest between 6 and 15 credit lines, and most of the lending activity happens under $10,000 across all numbers of current credit lines.
##
## Pearson's product-moment correlation
##
## data: prosper_data$LoanOriginalAmount and prosper_data$BankcardUtilization
## t = -11.088, df = 106330, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.03998678 -0.02797954
## sample estimates:
## cor
## -0.03398438
Comparing bankcard utilization and loan amount, we see that the majority of the loans are concentrated under the $10,000 range and most borrowers' utilization is under 1, but there is a lot of concentration right around the 1.0 mark, which translates to around 100% utilization. There is no notable correlation between these two factors (-0.034).
##
## Pearson's product-moment correlation
##
## data: prosper_data$LoanOriginalAmount and prosper_data$DebtToIncomeRatio
## t = 3.2828, df = 105380, p-value = 0.001028
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.004074882 0.016148830
## sample estimates:
## cor
## 0.01011222
From this visual, we observe that borrower interest rates are concentrated under 40% and that the majority of borrowers fall within the 50-100% debt to income range. The correlation coefficient between loan original amount and debt to income ratio is 0.01, which is negligible.
##
## Pearson's product-moment correlation
##
## data: prosper_data$CreditScoreRangeLower and prosper_data$EstimatedReturn
## t = -107.5, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3521408 -0.3402970
## sample estimates:
## cor
## -0.3462327
There is a moderate negative correlation (-0.346) between the lower credit score range and the estimated return: loans to borrowers with higher credit scores tend to have lower estimated returns.
There is a positive correlation between the borrower rate and bankcard utilization. As utilization rises, the interest rate also rises, because higher utilization tends to lower credit scores and thus leads to a higher interest rate.
##
## Pearson's product-moment correlation
##
## data: prosper_data$BorrowerRate and prosper_data$CreditScoreRangeLower
## t = -175.17, df = 113340, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4661358 -0.4569730
## sample estimates:
## cor
## -0.4615667
There is a negative correlation (-0.46) between the credit score and the borrower rate: higher credit scores afford borrowers a lower interest rate.
Prosper's in-house ratings, both alphabetical and numerical, largely coincide with the credit scores of the borrowers. The Prosper scores are largely based on the borrower's credit history, although they also take in other factors like bankcard utilization and delinquencies in the last 7 years.
The main feature of the dataset that stands out is that more than half of borrowers use the money for debt consolidation: the majority of people take out loans to pay off other loans.
The strongest relationship that I encountered in the dataset is between the lender yield and the borrower rate; the correlation between these two variables is 0.99.
The comparison between loan original amount and monthly payment indicates that monthly payments are high when the term is short. Borrowers who take a shorter time to pay off their loan pay a higher amount each month, so the loan is paid off faster and generally incurs less interest.
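The trade-off between term, monthly payment and total interest can be illustrated with the standard fixed-payment amortization formula. This is a sketch under stated assumptions: the function name `monthly_payment` is made up, the 32% annual rate is an illustrative input rather than a figure computed from the data, and whether Prosper calculates payments exactly this way is not confirmed by the dataset.

```r
# Fixed monthly payment on a fully amortizing loan.
# principal: amount borrowed; annual_rate: nominal yearly rate (0.32 = 32%);
# months: loan term in months.
monthly_payment <- function(principal, annual_rate, months) {
  r <- annual_rate / 12            # monthly interest rate
  if (r == 0) {
    return(principal / months)     # no interest: split the principal evenly
  }
  principal * r / (1 - (1 + r)^(-months))
}

# Same $4,000 loan at the same assumed rate, over 36 and 12 months.
pay_36 <- monthly_payment(4000, 0.32, 36)
pay_12 <- monthly_payment(4000, 0.32, 12)

# Total interest paid over the life of each loan.
interest_36 <- pay_36 * 36 - 4000
interest_12 <- pay_12 * 12 - 4000

print(round(c(pay_36 = pay_36, pay_12 = pay_12), 2))
print(round(c(interest_36 = interest_36, interest_12 = interest_12), 2))
```

Under these assumed inputs the 12 month loan has the higher monthly payment but the lower total interest, and the 36 month payment lands a little under $175, close to the spike seen in the monthly payment distribution.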
We can observe here that the better Prosper ratings are concentrated around higher credit scores, in the high 700s to 900, and these ratings also give borrowers the lowest interest rates, right around 10% and less. The opposite holds for the lower ratings: HR ratings go with lower credit scores and higher interest rates.
Here we take a look at the credit score (upper limit), the borrower rate and the Prosper score. We notice that the best Prosper scores settle around higher credit scores and lower borrower rates.
The Prosper score, compared here with the borrower rate and delinquencies over the last 7 years, shows that borrowers with low rates and high Prosper scores also tend to have few delinquencies. These would be considered low risk borrowers.
From this comparison we observe that loan amounts are not directly affected by Prosper's alphabetic rating. The borrower rate falls as the Alpha rating gets better.
The majority of the loans are in the darker portion of the credit lines visual, which is 15 and under. The borrower rate is affected by the Prosper rating: as the rating improves, the borrower rate falls.
The loan amount borrowed spreads evenly across the borrower rate and the debt to income ratio; these two factors don't seem to affect the loan amount borrowed.
It does not seem that the Prosper score or the credit score affects the loan amount borrowed.
This plot shows that the employed and full-time employed get the most loans. The loan amount borrowed doesn't seem to be affected by credit scores, but employment is a major factor.
Loan original amount is not affected by the number of credit lines the borrower has open, and credit scores also don't affect the amount borrowed.
High credit scores correlate with a better Prosper rating, and most of the higher credit ratings also go with fewer than 20 current open credit lines.
Most relationships in this section reinforce what we had already discovered in the univariate and bivariate sections: as the credit risk increases, the borrower rate also increases.
Comparing the 12, 36 and 60 month loans shows that original loan amounts at the 36 and 60 month terms are higher than at the 12 month term. The distributions for the 36 and 60 month terms are almost identical.
No models were created; the variables in the dataset were sufficient for the exploration that I did.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 131.6 217.7 272.5 371.6 2251.5
This plot shows the distribution of the monthly payments across the various loan terms. There is a very significant spike at around 175 dollars. Not only does this plot show the most common monthly payment; after some exploration it also reveals the most common loan amount, $4,000, and ties it to the 36 month term, where the majority of the borrowers are. I chose this plot because it summarizes the dataset by capturing the dominant amount borrowed, the dominant monthly payment and the most common loan term.
This plot goes a long way to show the relation between the Prosper rating and the lender yield across the loans. As the Prosper rating goes down, the lender yield goes up, which pretty much summarizes the dataset and the lending and borrowing relationship. I chose this plot because Prosper succeeds when investors bring their money to be borrowed, and they expect a return on that money. Prosper also uses a rating method to assess the risk of its borrowers. The plot summarizes the relationship between the borrower side and the lender side and shows how the investor's return relies on the borrower's rating.
This plot depicts the borrower rate's relationship with the credit score and the Prosper Rating Alpha. It neatly summarizes how the Prosper borrower rate comes about: an in-house alphabetical rating is founded on the borrower's credit score and is used to set the borrower rate, and the lower the borrower's risk, the better their rating. I chose this plot because it depicts how risk is assessed and the borrower rate determined, and thus the success of the platform.
The struggle I went through when analyzing this dataset is that it has a lot of variables that could potentially be explored. It is not possible to explore all 81 variables, but the variables I explored give a pretty good picture of the dataset. It took a while to settle on which variables to explore, and I even tried some that I eliminated because I didn't feel that they would tell the story as I wanted. There also seems to be a lot of repetition among variables that measure the same factor: for example, the upper and lower credit scores, ProsperRating, ProsperRating Alpha, CreditGrade, borrower rate and borrower APR are all used to measure creditworthiness. I had to do some research into the Prosper lending business model to familiarize myself with the dataset, so the analysis took a lot of time to get acquainted with.
The interest rates of over 30% surprised me, because I think that with that kind of interest people are bound to default: their payments don't seem to go far in paying off their original loan amount.
The majority of the loans are used to pay off other debts. This opens up a possible area of exploration: the debt crisis in the country, whereby people have so much debt that they take on more debt to pay it off. According to a MarketWatch article, U.S. households collectively had more than $1 trillion in credit-card debt in 2017. That's a growth rate of 4.9%. When people are struggling with such a heavy burden of debt, they may not advance financially and the economy does not do as well. Which factors we might change as a society is a potential area for some more in-depth analysis.